Downloading winning-args-corpus to /home/XXXXXXd/.convokit/downloads/winning-args-corpus
Downloading winning-args-corpus from http://zissou.infosci.cornell.edu/convokit/datasets/winning-args-corpus/winning-args-corpus.zip (73.7MB)... Done
WARNING: Utterance text must be a string: text of utterance with ID: t1_cnhre1n has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cnhs1jf has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cn7mmnt has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cn66mck has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cmt6w97 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cmsgxzm has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cmsitjr has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cmqqd49 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cmqsp80 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cmr57l3 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cmihu76 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cm85qp1 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cm6tktt has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_clz6jxt has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cm09icp has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cly0oho has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cly8wzq has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_clxpe30 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_clv6oas has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cls8zr2 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_clj3tcp has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_clhtmr0 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cler134 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_claiask has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl96qxj has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl92hgd has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl92jdc has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl7lil7 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_claaeq7 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl6xkrl has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl6yu6h has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl6ywwj has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl6yxf2 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl6u197 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl65sus has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl67wux has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl5n0e5 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl55jtf has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl53psw has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl54hde has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl5490t has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl38tmp has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cl2rq86 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ckujlyq has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ckujdib has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cklsm40 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ckmghf2 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cknmezc has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ckj2g0i has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ck8j9e4 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ck8cwb6 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cjxlz5l has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cjntq7s has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cjnp5hy has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cjiqn1e has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cjh7hr0 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cjh6yab has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cjcd62j has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cj8l1kv has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cj8zjae has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cj7yiy4 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cj7o50t has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cj50shk has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cizh6xb has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cj0ukb7 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ciz9aka has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cize32v has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_civr4mn has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cip166t has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cil3zav has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cijy9lk has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cijcups has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cifu4zp has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cid30or has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ci8wo4e has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ci9fbtw has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ci2pixf has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ci2pcku has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ci2cc99 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_chz0gqi has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_chw609b has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_chv6j3s has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_chv26as has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_churmo0 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_chswmlq has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_chsxa75 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_chnlrqw has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_chl8bsy has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_chiw401 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_chihwy1 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_chh0sby has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_chbrqn7 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ch7rh7p has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ch6watf has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ch6ssui has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ch61v2p has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ch3ng8c has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cgrm70r has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cgrz99m has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cgsegx6 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cgnj6j9 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ch073vt has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cgkn5ae has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cgiihu5 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cgieyuu has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cgig23g has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cgie34t has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cgh4ifl has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cggr63w has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cggd3v7 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cggutk5 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cgeiql6 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cgej6dy has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cge3th4 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cgedp3m has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cg8kgbj has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cfwc45l has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cfv3enu has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cfpx424 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cfl78p1 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cfilsz7 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cfj8wvg has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cffznn5 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cffo8l7 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cfdq2z7 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cfc9c4b has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cfcrvtf has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cfbxgor has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cfb4q61 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cfb5l7x has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cf99z8w has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cf2h04o has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ceptv5v has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ceo2hjj has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ceorou4 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cenyp1x has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cemye2y has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_celsimn has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ceehosv has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cebrznc has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ceconce has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cebnxzk has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cebd6e3 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cec3nwt has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_chv7nyk has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cedktpa has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ce9dob3 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ce5q8tj has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ce4gt8u has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ce0p904 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cdosehh has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cdiarjx has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cda83lr has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cd759lq has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cd4pzb7 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cd4p7s1 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cd4zoy7 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cdn4qk9 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ccue3mg has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ccphf6t has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ccphsld has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ccmc12f has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ccjdtdf has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cciu2c8 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ccizwjs has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ccivs3y has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ccj0iqr has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cciupg2 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cch8ibq has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ccgulma has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ccgulwv has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ccgnj8i has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cch2tad has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cch9i4l has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cch9wwe has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ccgvmn6 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cce35j0 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ccdamor has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cce8w7z has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cce42qt has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cc9x38t has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cc302nb has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cbyfuze has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cbvq8lh has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cbrec2x has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cbrmg6l has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cbq8ij7 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cbp1a4b has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cboztg2 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cbou2b5 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cbpau3m has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cbokj1o has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cboz1er has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cbmr775 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cblsags has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cbqpqwi has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cb9lo66 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cb9lrsq has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cb9zx65 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cb8beut has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cb7zk0m has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cb6c4sq has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cb33ath has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cb1kfhi has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cb0ahbi has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cb0bzr4 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cawc0b9 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cawjc4y has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_caxt2vf has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_caukmyp has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_caurnnc has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cavlgpq has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_caulblj has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cauridr has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_casyucd has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_catl342 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_camzwda has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_camssul has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_can6u3l has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_calik3u has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_caljueg has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cajb0py has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cajbvqf has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_caajlk6 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_caachmu has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cabc5l7 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9xkwd1 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9za8qk has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9x7xos has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9zr42i has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9tlmhq has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9r6a05 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_ca0tpgp has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9r7py2 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9rnobm has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9rp0so has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9qubp4 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9qwaar has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9pfs2o has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9lc68q has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9jlurv has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9hall9 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9gkb83 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c9bvjlq has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c98s9ip has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c98o57y has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c99ahf3 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c990rx6 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c97eoi9 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c97acob has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c97ah5c has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c95q7s3 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c95kdch has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_c95l9ml has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqzqi7q has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqzmedm has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cr13cnd has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqzr1lp has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqyom18 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqybmxe has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqy7ahu has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqybfct has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqyuen2 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqy83y1 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqxoxi5 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqn9nyq has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqnhdk2 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqn3mch has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqn7pac has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqmw3di has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqf8ryl has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqf8mpf has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqfb5xa has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqfhjkq has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqgarpy has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqdccrd has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqctlqr has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqcthds has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqkelp2 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqd8j90 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqd8a8j has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cqcrdrw has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cq8xz69 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cq97x51 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cq8zb08 has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cq959on has been casted to a string.
WARNING: Utterance text must be a string: text of utterance with ID: t1_cq9ci0g has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cq97u6v has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cptu8ww has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpstv77 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpsxhki has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpswzfn has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpuguvq has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpvfq3o has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cptriew has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpsxudr has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpszfeh has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpqft77 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpqfs0u has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cprk62o has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpnulph has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpnn77o has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpnju02 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpn5p75 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpni19v has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpo4x6m has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpbijg5 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cpbfv95 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_coz6meu has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_coz24er has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_coz3omg has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_copv0tz has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cor1wk2 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_coq0wpg has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_coqe7nl has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cokywqo has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cokwcae has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cokx8ik has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cokyzdf has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_col2sjk has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cokxscn has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cojw16m has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_coezsna has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cof1lb9 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cohpfyt has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cof4o1h has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_coeyzd4 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_coai23q has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_coa8l6z has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_coa80dz has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_coa8l84 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_co68zqm has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_co6a4yb has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_co6ef8m has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_co6ba44 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_co6rpqr has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_co69x71 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_co6a13x has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cnq3mwr has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cnq4ycs has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cnqs4bp has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cunpy0j has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cuior6e has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cufvp2u has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cu7mdqo has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cu7xcz4 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cu5cv9a has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cu3h1wa has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cu272pa has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cu0qopb has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ctzro35 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ctxzpb4 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ctwh70b has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ctqswb6 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ctqjqcd has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ctoncx7 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ctq5xed has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ctjjz2v has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cti68mr has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ctiqywc has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cthu9hx has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ctdqpne has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ctdkusf has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ctdhx1w has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ctcfj2v has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ct5ul8m has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ct5m7v4 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ct4klhi has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ct3sgfd has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ct5bruu has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ct1rmby has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cstl2de has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cstboz7 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cstbdw5 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ct7dtjh has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_csk5kbs has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ctam9yz has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_csh3dqw has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_csislrs has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_csed3sw has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ctalrap has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_csernqu has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_csdn1ui has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cs8mmtf has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cs4re28 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_ctb2nf1 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cs1tl9q has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_crul067 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_crusrpd has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_crjkijn has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cre72f1 has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_creewvt has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_creh7rb has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_crctrfy has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_crbkihy has been casted to a string.
[91mWARNING: [0mUtterance text must be a string: text of utterance with ID: t1_cra0ei4 has been casted to a string.
Number of threads in dataset: 293297
--------------EPOCH 1-------------
Test Accuracy: tensor(0.0082, device='cuda:0')
Loss: tensor(0.6258, device='cuda:0', grad_fn=<NllLossBackward>)
Saving model at iteration: 0
Loss: tensor(0.8222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.8758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.8473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.8105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6911, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.7180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.7102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.7611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.7104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.8378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5999, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.7655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.7786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.7803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.8896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.7361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.7277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5919, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.9634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.8966, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.7677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2982, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.7852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.7201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5940, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.9316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.9353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2969, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2926, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4979, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.7609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.9556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.8064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2944, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1985, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.9692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4000, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5995, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3952, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2954, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1941, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2966, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1982, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1945, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1970, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1933, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1989, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1948, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3983, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1946, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2976, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1987, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3991, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1999, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1984, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1973, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2957, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1919, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.8960, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0968, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1979, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0911, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1947, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0954, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0990, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0945, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0954, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0994, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.7226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0986, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1965, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0983, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0989, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0986, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1000, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0982, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0988, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5946, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3985, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0991, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0936, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0936, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0936, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1944, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0988, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1000, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0986, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0966, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0983, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0995, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0952, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0994, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0997, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0947, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0957, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0999, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3978, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0960, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0941, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0979, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0939, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0948, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0953, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0980, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0998, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0995, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0944, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0987, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0957, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0990, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0996, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0985, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0997, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0991, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0926, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0972, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1984, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0979, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0977, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0987, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0980, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0996, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0979, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0978, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0986, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0966, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0982, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0987, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0933, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0981, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0907, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2919, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2973, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0977, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0933, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0953, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0979, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0871, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0903, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0995, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0987, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0926, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0926, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0986, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0936, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0907, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0911, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0970, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0997, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0962, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0994, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0995, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0998, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0977, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0996, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0941, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0962, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0957, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0986, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0976, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0863, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0983, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0998, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0947, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0986, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0964, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0976, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0939, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0953, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0984, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0952, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0994, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0995, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0966, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0954, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0980, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0989, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0991, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0990, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0945, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0962, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3968, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0994, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0948, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0972, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0996, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0991, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0871, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0977, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0957, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0945, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0940, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2000, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0945, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1941, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0970, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0863, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0979, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0948, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0995, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0986, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0947, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0972, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0965, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0871, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0985, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0964, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0962, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0994, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0960, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0984, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0954, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0997, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0939, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0969, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0926, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0981, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0987, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1911, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0932, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0970, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0957, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0947, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0988, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0932, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0926, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0997, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0941, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0977, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0922, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0948, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0922, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0991, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0924, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0989, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0863, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1000, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0983, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0945, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0978, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0999, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0936, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0988, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0976, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0978, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0989, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0915, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0944, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0932, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0997, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0940, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0941, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0921, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0960, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0964, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0944, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0926, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0985, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0965, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0871, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0991, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0979, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0984, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0966, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0995, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0990, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0945, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0947, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0933, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0978, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0933, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0990, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0972, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0911, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0933, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0999, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0990, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0987, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1000, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0995, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0944, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0911, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0903, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0947, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0941, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0924, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0940, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0954, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0926, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0985, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0981, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0982, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0998, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0919, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0994, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0972, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0907, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0907, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0968, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0944, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0972, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0924, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0946, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0952, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0954, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0987, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0919, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0972, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2863, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0968, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0984, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0953, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0924, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0965, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0911, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0946, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0977, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0977, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0948, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0948, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0919, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0926, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0989, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0985, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0948, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0936, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0995, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0921, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0990, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0962, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0969, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0922, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0996, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0972, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0977, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0996, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0946, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0977, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0941, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2995, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0933, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0981, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0997, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0946, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1977, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0944, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0985, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0989, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0915, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0977, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0989, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0957, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0968, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0990, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0903, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0919, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0988, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0944, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0965, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0926, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0999, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0911, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0962, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0976, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0926, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0924, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0947, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0939, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0940, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0996, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0979, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0948, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0976, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1000, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0991, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0999, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0952, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0969, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0980, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0911, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1000, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0997, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0981, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0926, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1991, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0922, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0987, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0919, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0915, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0924, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0953, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0922, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0944, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0924, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0984, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0941, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0957, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0979, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1000, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0903, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0960, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0965, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0903, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0939, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0922, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0995, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0954, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0924, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0933, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0922, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0982, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0915, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0863, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0966, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0990, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0999, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0926, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0983, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1000, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0979, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0996, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0998, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0932, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0863, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0921, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0997, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0996, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0919, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0863, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0921, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0933, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0948, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0965, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0945, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0980, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0944, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0871, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0973, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0903, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0970, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0940, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0994, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0954, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0932, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0984, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0940, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0946, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0962, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0954, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0995, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0903, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0921, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0921, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0960, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0981, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0962, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0984, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0871, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0973, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0944, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0944, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0936, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.1948e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.7327e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.7298e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.8427e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.8427e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.8018e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.3612e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.4248e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.8586e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.1387e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.5688e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.5611e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.5896e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.1065e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.4838e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.6261e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.8726e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.8868e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0933, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0986, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0969, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0985, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1000, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0983, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0972, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.8581e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.9779e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.7430e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.9678e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.1286e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.7179e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.3858e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.4949e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.1815e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.9648e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.6552e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.1154e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.4864e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.3731e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.2387e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.8785e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.3992e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.0127e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.8522e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.0379e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.4692e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.6166e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.9728e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.1468e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
--------------EPOCH 2-------------
Test Accuracy: tensor(0.4262, device='cuda:0')
Loss: tensor(4.5487, device='cuda:0', grad_fn=<NllLossBackward>)
Saving model at iteration: 0
Loss: tensor(5.7832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.8762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.1781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.9792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.1779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.7109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.7802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.8350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.6429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.4215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.5152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3978, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.2472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0907, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1919, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0948, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0933, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0997, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0964, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0999, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0936, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1000, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0988, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0982, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0871, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0946, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.3291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0965, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0989, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0907, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0973, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0921, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0983, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0991, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0940, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0962, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0863, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0903, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0957, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0941, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0976, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0973, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0988, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0919, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0978, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0924, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0964, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0871, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0948, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0947, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0986, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0903, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0997, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0968, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0981, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0965, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0965, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0981, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0980, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0970, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0970, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0965, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0907, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0863, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0983, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0998, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0964, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0972, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0988, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0997, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0948, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0946, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0964, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0960, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0978, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0977, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0989, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0933, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0957, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0979, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0941, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0953, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0999, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0982, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0982, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0964, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0983, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0960, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0994, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0932, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0964, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0982, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0936, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0921, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0903, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0965, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0982, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0936, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0968, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0926, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0945, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0948, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0952, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0987, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0981, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0973, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0981, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0981, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0871, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0903, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0924, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0970, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0969, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0953, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0954, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0947, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0981, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0952, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0960, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0944, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0947, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0957, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0982, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0921, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0973, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0947, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0911, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0941, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0965, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0957, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0960, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0996, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0997, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0962, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0988, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0995, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0968, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.3427e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0946, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.6344e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.4871e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.2127e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.3899e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0969, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0964, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0985, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0915, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0994, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0998, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0947, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0954, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0994, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0932, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.8109e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2569e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.4319e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7212e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.4410e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0871, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.4109e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0924, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.4224e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0946, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3776e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7731e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.8868e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4644e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5725e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.3630e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4191e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4191e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4162e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5972e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9495e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.7182e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5898e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3361e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.8921e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7023e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.3325e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.8777e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9911e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7338e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0736e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7840e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5159e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7148e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.0980e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0907e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.0264e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.0057e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7176e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0903, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0979, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4826e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.1938e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.1434e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.6668e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.4493e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.5462e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.1556e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.3519e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.0664e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.5913e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.2621e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.4857e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.1798e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.3007e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.6881e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.7061e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.4083e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.3775e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.7844e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.3153e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.0657e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.1604e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.4155e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.2042e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.1708e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.4480e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.6012e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.0017e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.1880e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.7198e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.7887e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.5807e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.9410e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.9644e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.3291e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.4737e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.3273e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.9875e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.6892e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.1977e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.6982e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.1116e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.0832e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.3442e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.1170e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.9875e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.7001e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.5035e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.0140e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.8902e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.9887e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.3405e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.1307e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.2888e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.8608e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.3027e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.7392e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.3309e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.4850e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.6937e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.0663e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.3316e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
--------------EPOCH 3-------------
Test Accuracy: tensor(0.4262, device='cuda:0')
Loss: tensor(0.1251, device='cuda:0', grad_fn=<NllLossBackward>)
Saving model at iteration: 0
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0962, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0954, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0991, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0915, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0871, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0965, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0984, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0968, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0983, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0939, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0871, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0907, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0871, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0940, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0960, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0933, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0946, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0936, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0996, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0991, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0988, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0969, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0922, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0986, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0966, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0964, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0987, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0947, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0911, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0984, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0871, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0948, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0983, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0965, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0977, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0981, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0941, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0863, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0933, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0911, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0987, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0952, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0915, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0968, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0995, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0903, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0964, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0936, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0939, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0936, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0939, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.4311e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0976, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0907, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.0445e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.2540e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.1447e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.0192e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.7759e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.0988e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.0598e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.6943e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.7995e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.8383e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.3723e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.1365e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.7982e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.8331e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.7159e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.8418e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.8052e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.8137e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.4916e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.9152e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.0994e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.9073e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.2960e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0991, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.2583e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.1617e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.0870e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.3991e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.1676e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4065e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2550e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7692e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9976e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0941, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0921, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.4089e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0907, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0990, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2502e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0939, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0538e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.0868e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5161e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.3371e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9153e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5683e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3848e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7066e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6214e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1774e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6179e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.4190e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3593e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3593e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3583e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4263e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2372e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.1009e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2231e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4440e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.7082e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.4429e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7414e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.6022e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.5436e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7172e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4027e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2502e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6605e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4834e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7345e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.6948e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2494e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9792e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5934e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0807e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.2803e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0924, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6158e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.8000e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.2866e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.1187e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.2606e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9262e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.8731e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9009e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.5120e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9343e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.8966e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.7473e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.0653e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.8819e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.0451e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.6042e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.1363e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.5725e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.9296e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.2763e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.8077e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.5930e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7372e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.5256e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6954e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5936e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.2766e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.8736e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7611e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7442e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9167e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.2086e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7029e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.8749e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5230e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5592e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4700e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.3821e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.3657e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.8247e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4214e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.7483e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9119e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7063e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.1907e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7794e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3029e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3083e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5593e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.6445e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2726e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.8635e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7081e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.1099e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2780e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4307e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.4132e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5054e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6361e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5739e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.8474e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2643e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.2599e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.0770e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.5239e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5391e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.0779e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
--------------EPOCH 4-------------
Test Accuracy: tensor(0.4180, device='cuda:0')
Loss: tensor(0.1070, device='cuda:0', grad_fn=<NllLossBackward>)
Saving model at iteration: 0
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0954, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0926, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0921, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0980, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0966, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0904, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0922, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0922, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0973, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0990, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0983, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0919, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0939, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0954, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0985, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0919, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0863, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0933, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0976, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0936, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0922, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0953, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0985, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0998, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0987, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0940, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0973, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0932, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0947, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0921, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0984, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0979, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0971, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0922, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0969, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0962, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0922, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0972, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0964, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0919, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0945, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0964, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0967, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0986, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0988, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0955, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0957, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.9434e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0915, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0985, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9787e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.1553e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.1234e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6231e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.2407e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7442e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7740e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5401e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.3333e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6844e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9786e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9000e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.5623e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6307e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5414e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7064e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6875e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7234e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.2029e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7297e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.8402e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7009e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.0090e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0982, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9176e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.8550e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7821e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.7708e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4261e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5396e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3755e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7978e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9487e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0992, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.1689e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.7799e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.9766e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0871, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.6400e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.3353e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0886, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0985, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0899, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0945, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7616e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6462e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.1747e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.1819e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.1243e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.2886e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3643e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7622e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0622e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3393e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.1417e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6932e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3090e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7340e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0309e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0309e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0304e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9712e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9571e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3875e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6919e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.1448e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.0613e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.2397e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3870e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3872e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.9121e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2637e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7275e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.6099e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2281e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.1046e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3531e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.2705e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.6530e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6234e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.0695e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.1578e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.1161e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.1204e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.3052e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.8883e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2432e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.9793e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.8549e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.5026e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.0287e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7342e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7251e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7238e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2547e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7930e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5207e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5543e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9264e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.4017e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8727e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.1144e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.0486e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.2931e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4025e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.5316e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7313e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4535e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7254e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.1459e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6748e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6201e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.1155e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8440e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8225e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7754e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8874e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2638e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7685e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8979e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6056e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6490e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5672e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.4939e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.2596e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7885e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5291e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5644e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.0485e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.6353e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8368e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2132e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8653e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4766e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4720e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7086e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5822e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4384e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9461e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.6086e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8115e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2011e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4686e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6475e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3277e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6566e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7635e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8408e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.0901e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4852e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.8907e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3407e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.1747e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7064e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4030e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
--------------EPOCH 5-------------
Test Accuracy: tensor(0.4344, device='cuda:0')
Loss: tensor(0.0998, device='cuda:0', grad_fn=<NllLossBackward>)
Saving model at iteration: 0
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0920, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1000, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0951, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0996, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0911, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0925, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0933, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0936, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0941, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0983, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0984, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0819, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0907, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0960, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0988, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0911, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0878, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0980, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0960, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1000, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0985, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0975, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0926, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0960, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0866, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0972, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0982, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0916, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0960, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0900, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5851e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0853, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0846, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0993, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0922, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8511e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.0087e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.0110e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6848e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2844e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8286e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8931e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7127e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2932e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8547e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.0567e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.0164e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7193e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7621e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6832e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8466e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8173e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8643e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3010e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8316e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9099e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7869e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.0657e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9186e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8594e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7843e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.0594e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.0813e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0991, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.3259e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1706e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9983e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3609e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4808e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.2992e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.5376e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.8388e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0815, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.2431e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.7609e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.6175e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.8786e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0943, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0940, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0996, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5047e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.6202e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0976, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4568e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7201e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0019e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.0267e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.3506e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0688e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3157e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.1117e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.1685e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8342e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4518e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.1684e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4582e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.7722e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.7722e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.7706e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7233e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8789e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9052e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4129e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.7821e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.5204e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.8825e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.1937e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6439e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.6149e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0078e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3407e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.0366e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0054e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.0303e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.1431e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9546e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.2234e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4324e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.5223e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.8825e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.6283e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.4739e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.1296e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0996, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.7050e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9563e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.4081e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.3571e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.1799e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6445e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3570e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0488e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0712e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0429e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5746e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1403e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7602e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8429e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2857e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6774e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1924e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3231e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4141e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4153e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5524e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4847e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0927e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7494e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1050e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3134e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0588e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0289e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4124e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1924e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2503e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1788e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2742e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7041e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2099e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3180e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0504e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1107e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9947e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.1686e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5921e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.1507e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9780e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7879e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5168e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.8908e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2656e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5809e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2798e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9489e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9326e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1872e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7967e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8997e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3668e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.9351e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2431e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6373e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9290e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1436e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6505e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1068e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2309e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2582e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6274e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9817e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.8652e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8589e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3048e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1623e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9595e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
--------------EPOCH 6-------------
Test Accuracy: tensor(0.4344, device='cuda:0')
Loss: tensor(0.0936, device='cuda:0', grad_fn=<NllLossBackward>)
Saving model at iteration: 0
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0938, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0837, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0959, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0981, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0984, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0945, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0910, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0808, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0924, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0871, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0976, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0921, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0968, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0790, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0929, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0927, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0774, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0974, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0970, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0963, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0949, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0946, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0802, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0999, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0931, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0865, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0996, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0882, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0844, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0875, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0937, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0843, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0694, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0894, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8989e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0861, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3733e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5271e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5685e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2661e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8968e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4214e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4947e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3371e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7592e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4788e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6482e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6278e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3950e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3712e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2957e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4420e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3998e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4541e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9270e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4137e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4826e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3616e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6409e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0913, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4656e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4128e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3343e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.8160e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9123e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0964, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0922, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.6136e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9917e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8017e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1444e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2011e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0801, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.5678e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.1087e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5753e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.0237e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6097e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5307e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.5066e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0958, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0794, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0880, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0991, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0763, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0855, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0964, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3853e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.0619e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3725e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5906e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.2028e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.7143e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.5966e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9074e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0441e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0719, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.3410e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0927e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6913e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3252e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.9054e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.1141e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3502e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.0970e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.0970e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.0952e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5772e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7571e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5666e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.9155e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2732e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.7595e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9889e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.0648e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0808e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1696e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6154e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.8171e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8622e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.1124e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.1888e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.9081e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.9361e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0136e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5782e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.5223e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2415e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.6879e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.0217e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.4608e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.2231e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.3064e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0976, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.4449e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8313e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.1067e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.4815e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8943e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.1384e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0484e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7240e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7653e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7136e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2512e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8329e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3637e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4965e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9750e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4121e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8549e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8263e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1062e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9043e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1133e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8672e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7699e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3570e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7911e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7811e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7468e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7297e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0327e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8457e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9905e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8802e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9767e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4564e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9200e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0031e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7647e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8358e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7027e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.1387e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2570e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8332e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6891e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3546e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2567e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.6431e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9835e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1947e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9622e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6695e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6465e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9230e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3568e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6113e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0560e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.7062e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9111e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3859e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6424e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8647e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2823e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8090e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9381e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9020e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3167e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7064e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2021e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6031e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8583e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8951e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7455e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
--------------EPOCH 7-------------
Test Accuracy: tensor(0.4344, device='cuda:0')
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Saving model at iteration: 0
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0932, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0909, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0901, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0842, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0961, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0893, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0879, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0806, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0795, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0972, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0950, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0832, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0850, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0889, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0954, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0953, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0934, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0826, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0988, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0741, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0829, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0857, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0990, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0868, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0874, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0872, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0930, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0799, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0903, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0852, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0833, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0849, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0895, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0914, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0921, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0885, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0923, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0760, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0972, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0779, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0848, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0754, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0782, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0688, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0891, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0742, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0726, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5119e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0729, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0777, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0764, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0849e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2279e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3195e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0129e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6780e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1605e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2339e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0961e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4134e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2384e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3880e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4004e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.1613e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1245e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0489e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1877e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1414e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1980e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6909e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1612e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2274e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1017e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3796e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0863, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1994e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1462e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0673e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0792, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.9757e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8063e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0917, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.7464e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9079e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6975e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0381e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0042e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4970e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.1960e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7653e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.3246e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8544e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8124e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.6038e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0897, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0828, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0766, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0935, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0820, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0789, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3151e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.9819e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3281e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7490e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.7000e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.8142e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.9440e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7959e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8581e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.0548e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0776e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3680e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2678e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.6809e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.1059e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3345e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.8343e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.8343e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.8314e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4858e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6469e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2622e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.1216e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2234e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.2039e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6915e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4273e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0308e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8536e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9652e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.4712e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8197e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0164e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.8069e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.4338e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.4489e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.6144e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5215e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.3202e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.1248e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.2615e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.6198e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.8533e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.9685e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6820e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0905, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7999e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7334e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.9769e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.7939e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8314e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8582e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9209e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5556e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6109e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5497e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1094e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6915e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1647e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3519e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8354e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3634e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6841e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4964e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9784e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6205e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8580e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4947e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5977e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1350e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6281e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5031e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5886e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5791e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8239e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6531e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8519e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7240e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8381e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3789e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7947e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8537e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6255e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6990e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5503e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.1317e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.1778e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6593e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5501e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1246e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1429e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.6885e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8452e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9720e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7778e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5152e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4822e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7838e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0786e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4583e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8923e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.7772e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7295e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3149e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4856e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7345e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0629e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6548e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7866e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6737e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.1039e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5540e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7472e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4673e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6048e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7958e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6454e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
--------------EPOCH 8-------------
Test Accuracy: tensor(0.3607, device='cuda:0')
Loss: tensor(0.0841, device='cuda:0', grad_fn=<NllLossBackward>)
Saving model at iteration: 0
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0870, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0862, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0980, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0684, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0854, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0859, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0902, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0770, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0740, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0906, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0890, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0831, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0969, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0695, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0739, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0753, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0944, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0775, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0720, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0898, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0812, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0752, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0723, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0818, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0836, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0877, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0888, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0810, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0825, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0822, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0881, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0601, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0892, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0856, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0839, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0663, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0918, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0724, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0813, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0699, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0827, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0678, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0707, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0887, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2841e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0783, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0738, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0990, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0727, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9082e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0385e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1740e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8639e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5847e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9910e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0730e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9503e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1960e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0890e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2321e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2692e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9918e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9716e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8937e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0388e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9934e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0481e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5340e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0191e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0798e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9540e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2363e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0521e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9930e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9178e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0718, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0807, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.4947e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7214e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0860, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.6869e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8917e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6592e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0168e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9045e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7794e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.6100e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2213e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.9296e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3688e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3591e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.0446e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0840, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0873, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0883, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0627, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2852e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.3504e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0765, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3101e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.1814e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.3960e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.2075e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.4946e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7316e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7389e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.9814e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0800e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1297e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2328e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.1471e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.1035e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2970e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.8144e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.8144e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.8126e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4306e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5612e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0377e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.5126e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2230e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.8264e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4802e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0484e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0104e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6460e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4389e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.8698e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8051e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.6366e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.6504e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.2006e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.1590e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.3455e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5990e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.2414e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.0124e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.9793e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.4443e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.9917e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.1862e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.7811e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2538e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0761, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0785, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5063e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6723e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.9523e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.1290e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7464e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5558e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8235e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4469e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5071e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4394e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9940e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5902e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9940e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2778e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6740e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2833e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5526e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1834e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8987e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3872e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6903e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2155e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4697e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9204e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5151e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2775e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4818e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4780e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6822e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5206e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7670e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6114e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7444e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3170e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7187e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7847e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5279e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5863e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4412e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.8869e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.1657e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5180e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4397e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9892e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0771e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9676e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7366e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8038e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6476e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4131e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3811e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6698e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8824e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3576e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7788e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.1004e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6027e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3352e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3756e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6362e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8928e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5457e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6756e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5179e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9090e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4419e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4327e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3731e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4460e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7802e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5756e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
--------------EPOCH 9-------------
Test Accuracy: tensor(0.3525, device='cuda:0')
Loss: tensor(0.0771, device='cuda:0', grad_fn=<NllLossBackward>)
Saving model at iteration: 0
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0577, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0805, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0908, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0823, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0780, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0717, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0811, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0696, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0691, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0816, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0784, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0708, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0748, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0706, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0731, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0733, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0851, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0912, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0685, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0896, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0700, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0641, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0757, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0758, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0657, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0994, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0942, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0772, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0817, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0787, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0767, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0639, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0702, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0746, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0680, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0768, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0797, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0835, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0863, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0716, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0737, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0798, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0679, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0672, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0884, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0704, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0693, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0928, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0705, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0606, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0715, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0653, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0755, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0573, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0645, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0690, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0803, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0701, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0579, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0876, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0838, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0692, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1687e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0751, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0539, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0638, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0907, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0713, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8006e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9161e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0961e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7740e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5921e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8834e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9706e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8603e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0601e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0018e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1422e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2043e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9084e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8787e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7954e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9481e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9035e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9562e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4219e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9322e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9888e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.8386e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8660e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1538e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0756, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9639e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8944e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8249e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0714, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.2547e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7228e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0793, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0619, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.3968e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9091e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6543e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0404e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8611e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0616, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3592e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.3194e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8947e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.5608e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0695e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1067e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.7801e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0791, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0668, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0636, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0749, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0675, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0800, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0654, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0698, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2655e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.9124e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0686, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3046e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.7439e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.1078e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.3689e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0583, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0598, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.8203e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.1483e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0576, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6985e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0600, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6500e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.2689e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.0674e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0873e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8700e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2085e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.0607e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0956e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.2806e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2683e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.8843e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.8843e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.8865e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4021e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4527e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8573e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.7020e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.6247e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2623e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.5428e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3899e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8065e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0024e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5143e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0722e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.8118e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.6530e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7793e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.3166e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.5705e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.9863e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.9539e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.2976e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6609e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.1864e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9415e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.8817e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.4000e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.4772e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5088e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.5013e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8686e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0830, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0669, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3390e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0595, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0956, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0580, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6133e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.0115e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.6184e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6994e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3379e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7477e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3470e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4107e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3396e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8945e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4891e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8620e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2965e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5435e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2727e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4564e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9404e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8518e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2220e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5710e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0060e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3739e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7573e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4240e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0995e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3882e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3870e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5646e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4142e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6905e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5078e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6663e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2510e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6646e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7697e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4553e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4857e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3554e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.6410e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.1886e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4389e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3541e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9285e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0395e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4260e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6576e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6812e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5488e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3338e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2949e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5696e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7155e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2771e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6871e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5988e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5050e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4179e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3049e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5821e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7697e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4771e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6045e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3977e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7561e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3598e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2135e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2973e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3297e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8101e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5023e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
--------------EPOCH 10-------------
Test Accuracy: tensor(0.3525, device='cuda:0')
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Saving model at iteration: 0
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0666, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0769, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0646, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0648, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0610, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0762, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0776, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0743, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0682, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0681, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0781, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0631, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0721, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0735, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0562, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0564, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0677, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0676, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0809, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0804, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0589, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0656, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0673, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0773, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0596, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0847, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0703, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0858, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0674, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0542, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0643, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0597, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0972, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0640, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0590, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0734, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0556, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0711, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0526, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0608, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0745, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0494, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0581, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0796, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0710, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0630, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0651, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0670, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0944, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0605, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0824, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0725, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0747, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0635, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0750, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0730, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0649, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0728, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0664, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0736, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0617, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0778, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0786, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0593, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0821, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0504, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0561, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0566, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0650, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0722, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0603, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0575, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0629, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0550, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0744, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0582, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0633, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0511, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0486, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0845, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0869, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0655, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0587, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0571, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0540, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0547, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0533, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0671, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0522, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0634, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0687, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0518, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0586, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0520, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0493, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0660, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0625, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0689, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0483, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0523, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0697, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0652, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0591, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0599, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0659, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0637, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0788, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0618, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0658, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0592, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0541, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0464, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0570, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0536, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0501, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0546, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0613, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0563, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0484, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0661, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0567, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0642, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0864, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0535, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0604, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0491, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0584, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0516, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0667, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0814, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0565, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0626, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0470, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0537, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0662, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0469, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0569, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0521, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1157e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0461, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0428, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0529, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0551, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0621, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0644, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0517, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0524, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0620, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0532, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0434, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0557, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0545, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0475, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0419, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0543, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0423, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0624, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0496, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0505, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0515, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0476, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0462, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0834, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0665, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0492, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.1032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0572, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0450, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0632, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0615, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0585, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0438, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7367e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8342e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0433e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7221e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6202e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8137e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9006e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7996e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9650e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9409e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0812e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1688e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8376e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8133e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7260e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0451, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8852e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8421e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8940e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0485, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0429, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3388e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8821e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9333e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.0161e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8146e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1102e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0683, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9183e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8324e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7716e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.4990e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0528, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.0590e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.7582e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0647, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0471, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.4946e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9252e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6502e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.7682e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0443, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0728e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8513e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0510, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0559, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0527, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0534, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1592e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0498, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.2363e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0554, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7248e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0553, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.3339e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0444, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9058e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0609, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0478, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9499e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.5341e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0481, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0509, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0466, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0490, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0449, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0495, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0463, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0467, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0339, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0453, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0614, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0396, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0622, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0712, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0420, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0544, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0514, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0367, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0525, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0500, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0436, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0433, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0480, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0417, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0594, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0394, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0574, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0506, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0568, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0709, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0530, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0399, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0381, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0459, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0588, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0407, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0395, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0385, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0628, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0489, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0612, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0445, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0455, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0460, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0351, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0602, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0474, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0410, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0452, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0375, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0373, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0759, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0555, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0488, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0548, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0458, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0392, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0454, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0363, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0465, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0397, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0549, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2399e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0499, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0440, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0477, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0372, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.4373e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0611, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3044e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.4037e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.8339e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0482, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.8108e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0468, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0403, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0390, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0402, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0426, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0531, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.5747e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.9073e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0519, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0487, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6752e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0507, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0457, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5908e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0405, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.2190e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.2631e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0986e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6269e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.1933e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.3167e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.0789e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.7135e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2939e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.9564e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.9564e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.9530e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3882e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3282e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7168e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.4542e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.1530e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3281e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.3444e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3322e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6977e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.9704e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4447e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8356e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(4.3105e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.9626e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7393e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.2013e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.5474e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.7368e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(6.7791e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.4522e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4648e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.1510e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.8803e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.8581e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.8581e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.8581e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.8167e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.8167e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.8167e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.7580e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.7580e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.7580e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.6846e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.6846e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.6846e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0415, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.8760e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.3051e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0435, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(8.4729e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.2310e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.9067e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(7.3012e-06, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5761e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0319, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0411, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0412, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0416, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0374, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0343, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0623, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0368, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0393, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0502, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0442, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0360, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0378, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0388, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0408, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0320, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0508, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0318, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0300, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0432, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0418, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0473, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0398, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0448, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0607, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0377, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0513, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0497, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0371, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0383, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0337, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0313, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0732, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0311, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0560, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0336, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0362, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0356, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0441, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0437, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0286, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0422, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0361, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0309, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0333, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2318e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0269, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0265, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0355, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0447, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0389, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0321, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0387, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0391, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0503, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0352, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0289, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0512, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0305, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0479, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0326, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0414, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0409, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0346, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0382, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0439, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0341, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0421, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0260, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0413, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0322, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0274, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0324, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0348, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0472, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0538, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0254, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0247, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0430, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0270, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0330, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0446, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0275, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0354, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0279, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0338, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0353, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0558, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0237, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0424, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0323, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0293, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0328, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0867, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0245, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0332, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0222, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0315, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0357, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0314, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0350, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0552, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0312, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0349, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0256, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0295, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0456, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0364, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0272, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0288, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0220, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0331, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0303, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0302, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0248, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0400, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0335, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0250, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0208, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0221, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0401, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0327, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0282, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0380, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0329, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0359, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0345, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0301, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0281, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0238, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0187, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0283, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0284, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0211, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0292, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0308, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0317, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0285, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0167, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0304, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0264, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0358, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0131, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0404, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0427, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0242, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0294, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0262, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0307, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0252, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0216, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0210, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0369, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0298, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0190, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0207, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0189, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0268, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0370, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0219, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0578, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0344, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0340, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0205, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0199, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0280, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0166, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0384, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0243, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0225, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0229, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0366, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0246, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0236, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0347, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0160, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0267, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0228, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0316, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0306, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0376, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0223, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0266, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0159, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0287, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0127, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0241, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0277, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0193, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0164, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0162, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0088, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0203, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0178, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0194, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0278, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0253, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0128, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0139, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0213, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0244, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0157, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0154, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0104, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0255, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0276, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0198, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0258, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0218, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0118, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0173, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0291, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0179, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0018, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5616e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.2138e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0180, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(9.5604e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0271, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0182, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0290, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0296, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0431, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0196, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0151, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0263, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0233, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0224, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0212, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0197, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0110, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0186, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0169, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0117, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0184, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0023, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0232, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0116, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0257, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0334, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0251, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0325, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0147, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0214, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0175, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0209, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0133, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0047, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(5.9801e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0227, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0119, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0156, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0123, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0120, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0259, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0109, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0202, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0261, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0015, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0191, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0195, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0113, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0132, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0114, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0153, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0155, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0185, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0080, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0137, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0152, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0234, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0046, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0299, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0165, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0239, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0033, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0107, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0064, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0297, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0386, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0174, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0097, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0150, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0204, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0103, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0342, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0226, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0217, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0310, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0148, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0171, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0144, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0059, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0055, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0141, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0129, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0093, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0168, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0122, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0062, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0098, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0105, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0085, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0017, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0058, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0126, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0176, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0181, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0235, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0425, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0048, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0045, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0087, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0037, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0200, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0067, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0365, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0115, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0096, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0060, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0183, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0022, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0056, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0108, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0100, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0083, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0106, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0101, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0054, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0029, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0092, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0052, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0068, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0014, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0140, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0020, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0145, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0053, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0121, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0146, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0138, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0124, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0035, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0089, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0034, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0091, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0032, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0134, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0192, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0076, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0063, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0038, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0406, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0149, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0011, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0069, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0102, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0177, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0135, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0170, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0019, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0158, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0010, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0078, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0030, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0012, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0230, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0188, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0071, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0013, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0077, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0024, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0201, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0112, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0042, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0057, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0041, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0143, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0379, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0090, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0082, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0215, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0070, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0028, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0075, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0073, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0125, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0061, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0026, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0273, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0130, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0072, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0095, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0240, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0036, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0172, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0142, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0050, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0049, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0009, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0039, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0161, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0249, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0111, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0099, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0163, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0031, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0136, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0051, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0094, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0081, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0086, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0027, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0043, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0066, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0079, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0206, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0004, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6780e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.1865e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6953e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2681e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3328e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2576e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8190e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4044e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7583e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4175e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4264e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0044, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.3377e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3767e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0016, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.7550e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0040, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8194e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0903e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4703e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8213e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2976e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0084, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6335e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3528e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9564e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3229e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3206e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4855e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3445e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6481e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4360e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6216e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.2091e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0074, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6330e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8079e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3983e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3996e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0025, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2935e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.4532e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.1278e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.4104e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2860e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.9345e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0005, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0239e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.0542e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6027e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6008e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4775e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0231, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2734e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2302e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4938e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5832e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0008, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2172e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6169e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(3.2703e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0006, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4283e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.5021e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2434e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5547e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.6778e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4304e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.5563e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.3014e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.6168e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2910e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(2.0657e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2395e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.2498e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0003, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0065, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0007, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.8523e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0002, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(1.4188e-05, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0021, device='cuda:0', grad_fn=<NllLossBackward>)
Loss: tensor(0.0001, device='cuda:0', grad_fn=<NllLossBackward>)
